6.9 Pumping Lemma

“The pumping lemma, that \(\forall \exists \forall\exists\forall\) monstrosity, is equivalent to the statement ‘every sufficiently-long walk over a finite graph contains a cycle.’ Not only is that intuitive, it’s obvious, unlike the pumping lemma.” - Hillel Wayne

We now consider a different method for proving a language \(L\) is not regular. It starts with a definition that may seem odd, but turns out to be useful.

Definition

Given a language \(L\) and a “pumping length” \(p\geq 1\), we say a string \(s\in L\) (with \(|s| \geq p\)) can be pumped if > >1. \(s = xyz\) (there’s a substring \(y\) inside \(s\)) > >2. \(y \not= \epsilon\) and \(|xy| \leq p\) (\(y\) is nonempty and not too far from the front) > >3. \(xy^iz \in L\) for every \(i\geq 0\) (\(xz\), \(xyz\), \(xyyz\), \(xyyyz\), … are all in \(L\))

Example

If \(L := \{\ a^nb^m\ |\ n,m\geq 0\ \}\)
and \(p := 3\), then \(~~s:= \mathrm{abbbbbb~~}\) can be pumped. > There are many ways to show this, including > >- Set \(x:=\mathrm{a}\), > \(y:=\mathrm{bb}\), and > \(z:=\mathrm{bbbb}\) >- Set \(x:=\mathrm{ab}\), > \(y:=\mathrm{b}\), and > \(z:=\mathrm{bbbb}\) >- Set \(x:=\mathrm{a}\), > \(y:=\mathrm{b}\), and > \(z:=\mathrm{bbbbb}\) >- Set \(x:=\epsilon\), \(y:=\mathrm{a}\), > and \(z:=\mathrm{bbbbbb}\) > In all cases \(xy^iz\in L\) for every \(i\geq 0\).

Example

If \(L := \{\ w\in\{0,1\}^*\ |\ w \mathrm{\ has\ even\ 0's\ and\ odd\ 1's}\ \}\) and \(p := 4\)
then \(~~s := 01000100111~~\) can be pumped. > There is exactly one way to do so, namely > >- Set \(x:=\mathrm{01}\), > \(y:=\mathrm{00}\), and > \(z:=\mathrm{0100111}\) > Then \(xy^iz\in L\) for every \(i\geq 0\).

Example

If \(L\) is the set of strings of balanced parentheses, and \(p := 4\)
then \(~~s := (()())~~\) can be pumped. > There are two ways to do so > >- Set \(x:=\mathrm{(}\), > \(y:=\mathrm{()}\), and > \(z:=\mathrm{())}\) >- Set \(x:=\mathrm{((}\), > \(y:=\mathrm{)(}\), and > \(z:=\mathrm{)))}\) > In both caes, \(xy^iz\in L\) for every \(i\geq 0\).

The Pumping Lemma

The utility of this definition comes from the following fact:

Lemma

Pumping Lemma

If \(L\) is a regular language, then there exists \(p\geq 1\) such that every string \(s\in L\) with \(|s|\geq p\) can be pumped. > More importantly the contrapositive holds; if for every \(p\geq\) there is a string \(s\in L\) with \(|s|\geq p\) that cannot be pumped, then \(L\) is not regular.

That is, in a regular language, every string in that language (larger than some fixed size) can be pumped.

Equivalently, if a language contains arbitrarily long unpumpable strings (i.e., unpumpable strings with more than \(p\) characters for any choice of \(p\)), then that language cannot be regular. We can use this fact to prove that many nonregular languages are nonregular.

(Warning: The Pumping Lemma does not say “if and only if.” The presence of long unpumpable strings proves a language is not regular, but the absence of such strings tells us nothing. There are regular and nonregular languages where all long strings are pumpable.)

Example

The language \(L := \{\ a^nb^n\ |\ n \geq 0\ \ \}\) is not regular. > Proof.
Let \(p \geq 1\) be given.
Put \(s := a^pb^p\). Note that \(s\in L\) and \(|s|\geq p\).
To show: \(s\) is not pumpable.
Let \(s := xyz\) be any partition of \(s\) into three parts with \(y\not=\epsilon\) and \(|xy|\leq p\).
> Then \(y\) must consist of one or more a’s.
> So \(xy^2z\) will have more a ’s than b’s, and \(xy^2z\not\in L\).
Thus no partition works to pump \(s\).
>By the Pumping Lemma, \(L\) is not regular.

Example

The language \(L := \{\ a^nb^n\ |\ n \geq 0\ \ \}\) is not regular. > Alternate Proof (with a less clever choice of \(s\)).
Let \(p \geq 1\) be given.
Put \(s := a^{\lceil p/2\rceil}b^{\lceil p/2\rceil}\). Note that \(s\in L\) and \(|s|\geq p\).
To show: \(s\) is not pumpable.
Let \(s := xyz\) be any partition of \(s\) into three parts with \(y\not=\epsilon\) and \(|xy|\leq p\). > > There are three cases to consider. > >- If \(y\) consists of one or more a’s, then \(xy^2z\) will have more a ’s than b’s, and \(xy^2z\not\in L\) >- If \(y\) consists of one or more b’s, then \(xy^2z\) will have more b’s than a’s, and \(xy^2z\not\in L\) >- If \(y\) consists of both a’s and b’s, then \(xy^2z\) will not have all its a’s before all its b’s, and \(xy^2z\not\in L\) > Thus no partition works to pump \(s\).
By the Pumping Lemma, \(L\) is not regular.

Example

The language \(L := \{\ w\in\{a,b\}^{*}\ \ |\ w \mathrm{\ has\ the\ same\ number\ of\\ a's\ and\ b's} \}\) is not regular. > Proof.
Let \(p \geq 1\) be given.
Put \(s := a^pb^p\). Note that \(s\in L\) and \(|s|\geq p\).
To show: \(s\) is not pumpable.
Let \(s := xyz\) be any partition of \(s\) into three parts with \(y\not=\epsilon\) and \(|xy|\leq p\).
> Then \(y\) must consist of one or more a’s.
> So \(xy^2z\) will have more a ’s than b’s, and \(xy^2z\not\in L\).
Thus no partition works to pump \(s\).
By the Pumping Lemma, \(L\) is not regular.

Example

The language \(L := \{\ 0^n1^m\ |\ n\leq m\}\) is not regular. > Proof.
Let \(p \geq 1\) be given.
Put \(s := 0^p1^p\). Note that \(s\in L\) and \(|s|\geq p\).
To show: \(s\) is not pumpable.
Let \(s := xyz\) be any partition of \(s\) into three parts with \(y\not=\epsilon\) and \(|xy|\leq p\).
> Then \(y\) must consist of one or more 0’s.
> So \(xy^2z\) will have more 0 ’s than 1’s, and \(xy^2z\not\in L\).
Thus no partition works to pump \(s\).
By the Pumping Lemma, \(L\) is not regular.

Example

The language \(L := \{\ 0^n1^m\ |\ n\geq m\}\) is not regular. > Proof.
Let \(p \geq 1\) be given.
Put \(s := 0^p1^p\). Note that \(s\in L\) and \(|s|\geq p\).
To show: \(s\) is not pumpable.
Let \(s := xyz\) be any partition of \(s\) into three parts with \(y\not=\epsilon\) and \(|xy|\leq p\).
> Then \(y\) must consist of one or more 0’s.
> So \(xz\) will have fewer 0 ’s than 1’s, and \(xz\not\in L\).
Thus no partition works to pump \(s\).
By the Pumping Lemma, \(L\) is not regular.

Example

The language \(L := \{\ a^nb^nc^n\ |\ n \geq 0\ \}\) is not regular. > Proof.
Let \(p \geq 1\) be given.
Put \(s := a^pb^pc^p\). Note that \(s\in L\) and \(|s|\geq p\).
To show: \(s\) is not pumpable.
Let \(s := xyz\) be any partition of \(s\) into three parts with \(y\not=\epsilon\) and \(|xy|\leq p\).
> Then \(y\) must consist of one or more a’s.
> So \(xy^2z\) will have more a ’s than b’s or c’s, and \(xy^2z\not\in L\).
Thus no partition works to pump \(s\).
By the Pumping Lemma, \(L\) is not regular.

Example

The language \(L := \{\ w\in\{a,b\}^{*}\ \ |\ w \mathrm{\ is\ a\ palindrome}\ \}\) is not regular. > Proof.
Let \(p \geq 1\) be given.
Put \(s := a^pba^p\). Note that \(s\in L\) and \(|s|\geq p\).
To show: \(s\) is not pumpable.
Let \(s := xyz\) be any partition of \(s\) into three parts with \(y\not=\epsilon\) and \(|xy|\leq p\).
> Then \(y\) must consist of one or more a’s.
> So \(xy^2z\) will have more a ’s before the b than after, and \(xy^2z\not\in L\).
Thus no partition works to pump \(s\).
By the Pumping Lemma, \(L\) is not regular.

Conclusion

Extra: Justifying the Pumping Lemma

We don’t need to know the proof of the Pumping Lemma in order to use it effectively. But it is not hard to see why it is true, once you know the trick.

Lemma

Pumping Lemma

If \(L\) is a regular language, then there exists \(p\geq 1\) such that every string \(s\in L\) with \(|s|\geq p\) can be pumped. > Proof.
Suppose \(L\) is a regular language.
Then there is a DFA whose language is \(L\).
Let \(p\) be the number of states in that DFA.
Let \(s\) be an arbitrary string in \(L\) such that \(|s|\geq p\). > > Obviously the DFA must accept \(s\).

But running \(s\) through the DFA visits at least \(p+1\) states, which means that some state was visited at least twice (and at least one such repetition happened within the first \(p\) characters of the input).

That is, the string \(s\) went through a loop on the way from the start state to an accepting state.

Call the part of \(s\) before the first loop \(x\), the part of \(s\) that takes us around the first loop \(y\), and call the rest of \(s\) \(z\).

Then the strings \(xy^iz\) take us through the DFA from the start state to the same accepting state for every \(i\geq 0\), because these strings take the same path as \(s\) except they avoid the loop entirely, or take the loop multiple times.

Since all these strings are accepted, they must all belong to \(L\).
Thus, we have shown that \(s\) is pumpable. > Since \(s\) was an arbitrary string in \(L\) with length \(\geq p\), all such strings are pumpable.

Previous: 6.8 Non-Regular Languages

Next: 6.10 CFLs